How Humans Describe Short Videos - Details of an Experiment
Authors
Abstract
Human vision can be used as a model for computer vision. We have conducted an experiment to investigate several properties of human vision that can be applied to, and can improve, computer vision. This report describes in detail how human subjects described the videos. Their descriptions show the importance of higher levels of abstraction, and that features of an object related to a task can raise the object’s relevance.
This work was partially supported by the Austrian Science Fund under grants P18716-N13, S9103-N04 and USA AFOSR.
Similar resources
How Humans Describe Short Videos
Recognition, manipulation and representation of visual objects can be simplified significantly by “abstraction”. By definition abstraction extracts essential features and properties while it neglects unnecessary details. We have conducted two sets of experiments in order to relate abstraction levels used by humans when describing videos, to abstraction level categories used in computer vision. ...
Structuring Personal Activity Records Based on Attention - Analyzing Videos from Head-Mounted Camera
This paper introduces a novel method for analyzing video records of personal activities captured by a head-mounted camera. The method aims to help the user retrieve the most important or relevant portions of the videos. For this purpose, we use the behaviors that the user exhibits when he/she pays attention to something. We define two types of those behaviors, one of which is “gaze at...
Labeling and modeling large databases of videos
As humans, we can say many things about the scenes surrounding us. For instance, we can tell what type of scene and location an image depicts, and describe what objects live in it, their material properties, or their spatial arrangement. These comprise descriptions of a scene and are widely studied areas in computer vision. This thesis, however, hypothesizes that observers have an inherent prior kno...
Structuring Personal Experiences -- Analyzing Views from a Head-Mounted Camera
This paper introduces a novel method for analyzing video records of personal activities captured by a head-mounted camera. The method aims to help the user retrieve the most important or relevant portions of the videos. For this purpose, we use the behaviors that the user exhibits when he/she pays attention to something. We define two types of those behaviors, one of which is “gaze at...